extrapolation error
Trustworthy Feature Importance Avoids Unrestricted Permutations
Borgonovo, Emanuele, Cappelli, Francesco, Lu, Xuefei, Plischke, Elmar, Rudin, Cynthia
Since their introduction by Breiman (2001), permutation-based feature importance measures have been widely adopted. However, randomly permuting the entries of a dataset may create new points far from the original data or even "impossible data." In a permuted dataset, we may find children who are retired or individuals who graduated from high school before they were born (Mase et al. 2022, p. 1). Forcing ML models to make predictions at these points causes them to extrapolate, making explanations unreliable (Hooker et al. 2021). Every non-trivial permutation-based variable importance measure, including SHAP (Lundberg and Lee 2017), Knockoffs (Barber and Candรฉs 2015), conditional model reliance (Fisher et al. 2019), and accumulated local effect (ALE) plots (Apley and Zhu 2020) suffer from this. We propose and compare three new strategies to address extrapolation issues. The first combines conditional model reliance from Fisher et al. (2019) with a Gaussian transformation. By mapping data quantiles to a Gaussian distribution and back, we adjust only the quantiles of point values, significantly reducing extrapolation. Under a Gaussian copula assumption for the feature distribution, we prove that the new data points follow the same probability distribution as the original data.
Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning
Learning from datasets without interaction with environments (Offline Learning) is an essential step to apply Reinforcement Learning (RL) algorithms in real-world scenarios.However, compared with the single-agent counterpart, offline multi-agent RL introduces more agents with the larger state and action space, which is more challenging but attracts little attention. We demonstrate current offline RL algorithms are ineffective in multi-agent systems due to the accumulated extrapolation error. In this paper, we propose a novel offline RL algorithm, named Implicit Constraint Q-learning (ICQ), which effectively alleviates the extrapolation error by only trusting the state-action pairs given in the dataset for value estimation. Moreover, we extend ICQ to multi-agent tasks by decomposing the joint-policy under the implicit constraint. Experimental results demonstrate that the extrapolation error is successfully controlled within a reasonable range and insensitive to the number of agents. We further show that ICQ achieves the state-of-the-art performance in the challenging multi-agent offline tasks (StarCraft II). Our code is public online at https://github.com/YiqinYang/ICQ.
Frictional Q-Learning
We draw an analogy between static friction in classical mechanics and extrapolation error in off-policy RL, and use it to formulate a constraint that prevents the policy from drifting toward unsupported actions. In this study, we present Frictional Q-learning, a deep reinforcement learning algorithm for continuous control, which extends batch-constrained reinforcement learning. Our algorithm constrains the agent's action space to encourage behavior similar to that in the replay buffer, while maintaining a distance from the manifold of the orthonormal action space. The constraint preserves the simplicity of batch-constrained, and provides an intuitive physical interpretation of extrapolation error. Empirically, we further demonstrate that our algorithm is robustly trained and achieves competitive performance across standard continuous control benchmarks.